Scalable Alignment Kernels via Space-Efficient Feature Maps

Authors

  • Yasuo Tabei
  • Yoshihiro Yamanishi
  • Rasmus Pagh
Abstract

String kernels are attractive tools for analyzing string data. Among them, alignment kernels are known for their high prediction accuracy in string classification when combined with SVMs in various applications. However, alignment kernels have a crucial drawback: they scale poorly because their computational complexity is quadratic in the number of input strings, which limits large-scale applications in practice. We present the first approximation for alignment kernels, named ESP+SFM, which leverages a metric embedding called edit-sensitive parsing (ESP) and space-efficient feature maps (SFM) for random Fourier features (RFF) for large-scale string analyses. Input strings are projected into RFF vectors using ESP and SFM. SVMs are then trained on the projected vectors, which significantly improves the scalability of alignment kernels while preserving their prediction accuracy. We experimentally test the ability of ESP+SFM to learn SVMs for large-scale string classification on various massive string datasets, and we demonstrate the superior performance of ESP+SFM with respect to prediction accuracy, scalability, and computational efficiency.

ACM Reference Format: Yasuo Tabei, Yoshihiro Yamanishi, and Rasmus Pagh. 2018. Scalable Alignment Kernels via Space-Efficient Feature Maps. In Proceedings of the 24th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD’18). ACM, New York, NY, USA, Article 4, 10 pages. https://doi.org/10.475/123_4
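The abstract's projection step relies on random Fourier features. As a rough illustration of that general idea only, the sketch below builds plain RFF for a Gaussian kernel on numeric vectors. It is not the ESP+SFM method: the abstract does not specify the kernel form, the sampling distribution, or how SFM reduces memory, so the Gaussian kernel, the names `make_rff_map`, `gaussian_kernel`, and `dot`, and all parameter values here are assumptions chosen for illustration.

```python
import math
import random


def make_rff_map(dim, num_features, sigma, seed=0):
    """Return z such that dot(z(x), z(y)) approximates exp(-||x-y||^2 / (2*sigma^2)).

    Assumption: a Gaussian kernel on numeric vectors (not the paper's setting).
    """
    rng = random.Random(seed)
    # Frequencies sampled from the Gaussian kernel's spectral density N(0, 1/sigma^2).
    weights = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
               for _ in range(num_features)]
    # Random phase offsets, uniform on [0, 2*pi].
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def feature_map(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return feature_map


def gaussian_kernel(x, y, sigma):
    """Exact Gaussian kernel value, used only to check the approximation."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def dot(u, v):
    """Plain inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    x = [0.2, -0.1, 0.4]
    y = [0.0, 0.3, 0.1]
    z = make_rff_map(dim=3, num_features=4000, sigma=1.0)
    print("approximate kernel:", dot(z(x), z(y)))
    print("exact kernel:      ", gaussian_kernel(x, y, 1.0))
```

Because the kernel value becomes an ordinary inner product of explicit vectors, a linear SVM can be trained on those vectors without ever forming the full kernel matrix, which is the source of the scalability gain the abstract describes.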

Similar resources

Image alignment via kernelized feature learning

Machine learning is an application of artificial intelligence that can automatically learn and improve from experience without being explicitly programmed. The primary assumption of most machine learning algorithms is that the training set (source domain) and the test set (target domain) follow the same probability distribution. However, in most of the real-world application...

Spherical Structured Feature Maps for Kernel Approximation

We propose Spherical Structured Feature (SSF) maps to approximate shift and rotation invariant kernels as well as b-order arc-cosine kernels (Cho & Saul, 2009). We construct SSF maps based on a point set on the (d − 1)-dimensional sphere S^{d−1}. We prove that the inner product of SSF maps is an unbiased estimate of the above kernels if an asymptotically uniformly distributed point set on S^{d−1} is given. Acco...

An efficient method for cloud detection based on the feature-level fusion of Landsat-8 OLI spectral bands in deep convolutional neural network

Cloud segmentation is a critical pre-processing step for any multi-spectral satellite image application. In particular, disaster-related applications e.g., flood monitoring or rapid damage mapping, which are highly time and data-critical, require methods that produce accurate cloud masks in a short time while being able to adapt to large variations in the target domain (induced by atmospheric c...

Speeding up Training with Tree Kernels for Node Relation Labeling

We present a method for speeding up the calculation of tree kernels during training. The calculation of tree kernels is still heavy even with efficient dynamic programming (DP) procedures. Our method maps trees into a small feature space where the inner product, which can be calculated much faster, yields the same value as the tree kernel for most tree pairs. The training is sped up by using th...

Operator-Valued Bochner Theorem, Fourier Feature Maps for Operator-Valued Kernels, and Vector-Valued Learning

This paper presents a framework for computing random operator-valued feature maps for operator-valued positive definite kernels. This is a generalization of the random Fourier features for scalar-valued kernels to the operator-valued case. Our general setting is that of operator-valued kernels corresponding to RKHS of functions with values in a Hilbert space. We show that in general, for a give...

Journal:
  • CoRR

Volume: abs/1802.06382

Publication date: 2018